Language Independent NER using a Maximum Entropy Tagger
نویسندگان
چکیده
Named Entity Recognition (NER) systems need to integrate a wide variety of information for optimal performance. This paper demonstrates that a maximum entropy tagger can effectively encode such information and identify named entities with very high accuracy. The tagger uses features which can be obtained for a variety of languages and works effectively not only for English, but also for other languages such as German and Dutch.
منابع مشابه
Recognizing named entities in spoken Chinese dialogues with a character-level maximum entropy tagger
Named Entity Recognition (NER) is an important task in information extraction, where major attention has been paid to written texts of a news or academic paper (esp. biomedical) style. In this paper we report the first piece of work on NER in spoken Chinese dialogues, as a preliminary step for spoken language understanding. The NER task is taken as a sequential classification problem and solved...
متن کاملVoted NER System using Appropriate Unlabeled Data
This paper reports a voted Named Entity Recognition (NER) system with the use of appropriate unlabeled data. The proposed method is based on the classifiers such as Maximum Entropy (ME), Conditional Random Field (CRF) and Support Vector Machine (SVM) and has been tested for Bengali. The system makes use of the language independent features in the form of different contextual and orthographic wo...
متن کاملLanguage Independent Named Entity Recognition
The role of Internet in personal, economic and political advancement is growing in a fast pace. By the turn of century, data on web reaches to petabytes or exabytes or may even scale up-to unimaginable quantities. Extraction of precise and structured information from such large amounts of unstructured or semi-structured data is the major concern of web known as Information Extraction. Named ent...
متن کاملMaximum Entropy Part-of-Speech Tagging in NLTK
In this paper we implement a part of speech tagger for NLTK using maximum entropy methods. Our tagger can be used as a drop-in replacement for any of the other NLTK taggers. We give a brief tutorial on how to use our tagger as well as describing the implementation at a high level. We evaluate our tagger on the Penn Tree Bank and compare our results to those of previous work.
متن کاملThe Tanl Tagger for Named Entity Recognition on Transcribed Broadcast News at Evalita 2011
The Tanl tagger is a flexible sequence labeller based on Conditional Markov Model that can be configured to use different classifiers and to extract features according to feature templates expressed through patterns provided in a configuration file. The Tanl Tagger was applied to the task of Named Entity Recognition (NER) on Transcribed Broadcast News of Evalita 2011. The goal of the task was t...
متن کامل